Leveraging supplemental representations for sequential transduction

Authors

  • Aditya Bhargava
  • Grzegorz Kondrak
Abstract

Sequential transduction tasks, such as grapheme-to-phoneme conversion and machine transliteration, are usually addressed by inducing models from sets of input-output pairs. Supplemental representations offer valuable additional information, but incorporating that information is not straightforward. We apply a unified reranking approach to both grapheme-to-phoneme conversion and machine transliteration, demonstrating substantial accuracy improvements by utilizing heterogeneous transliterations and transcriptions of the input word. We describe several experiments that involve a variety of supplemental data and two state-of-the-art transduction systems, yielding error rate reductions ranging from 12% to 43%. We further apply our approach to system combination, with error rate reductions between 4% and 9%.
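The general idea of reranking a base system's n-best list with supplemental strings can be illustrated with a toy sketch. This is not the authors' method: the function names `similarity` and `rerank`, the hand-set `weight`, and the use of a plain string-similarity ratio as the only supplemental feature are all assumptions made for illustration. A real reranker would learn weights over many features.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]; a stand-in for learned features."""
    return SequenceMatcher(None, a, b).ratio()


def rerank(nbest, supplemental, weight=1.0):
    """Reorder (candidate, base_score) pairs using supplemental strings.

    nbest: list of (candidate_output, base_score) from a base system.
    supplemental: other representations of the same input word
        (hypothetical toy strings here).
    weight: hand-set for illustration; a trained reranker would learn it.
    """
    def score(item):
        candidate, base_score = item
        if not supplemental:
            # No supplemental data: fall back to the base system's score.
            return base_score
        # Reward candidates that resemble at least one supplemental string.
        bonus = max(similarity(candidate, s) for s in supplemental)
        return base_score + weight * bonus

    return sorted(nbest, key=score, reverse=True)


# Toy n-best list: the base system slightly prefers "xyz".
nbest = [("abc", 0.50), ("xyz", 0.55)]
reranked = rerank(nbest, ["abd"])
```

In this toy case the supplemental string moves "abc" to the top of the list, while an empty supplemental list leaves the base ordering unchanged.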

Similar resources

Sentence Compression with Joint Structural Inference

Sentence compression techniques often assemble output sentences using fragments of lexical sequences such as ngrams or units of syntactic structure such as edges from a dependency tree representation. We present a novel approach for discriminative sentence compression that unifies these notions and jointly produces sequential and syntactic representations for output text, leveraging a compact i...

Toward Neurally-Inspired Computational Models of Narrative

In the spirit of the neuroscience theme of this year’s meeting, I will describe a set of cognitive and neurophysiological phenomena that are important for the processing of narrative text at the discourse level. Text processing depends on sequential structure in language and also in the events that language describes. Semantic representations of events capture perceptual and motor properties of...

The Incremental Design of Parallel Compiler Intermediate Representations using SPIRE

SPIRE is the first incremental methodology for designing the intermediate representations (IR) of compilers that target parallel programming languages. Its core philosophy is to extend in a systematic manner the IRs found in the compilation frameworks of sequential languages. Avoiding the often-used ad-hoc approach of encoding all parallel constructs as “fake” function calls, SPIRE enables the ...

Multiple Many-to-Many Sequence Alignment for Combining String-Valued Variables: A G2P Experiment

We investigate multiple many-to-many alignments as a primary step in integrating supplemental information strings in string transduction. Besides outlining DP based solutions to the multiple alignment problem, we detail an approximation of the problem in terms of multiple sequence segmentations satisfying a coupling constraint. We apply our approach to boosting baseline G2P systems using homoge...

Sequence Transduction with Recurrent Neural Networks

Many machine learning tasks can be expressed as the transformation—or transduction—of input sequences into output sequences: speech recognition, machine translation, protein secondary structure prediction and text-to-speech to name but a few. One of the key challenges in sequence transduction is learning to represent both the input and output sequences in a way that is invariant to sequential d...

Publication date: 2012